L Number | Hits | Search Text | DB | Time stamp
19 | 10 | (("6441832") or ("6195689") or ("6247052") or ("6473902") or ("5778187") or ("6185625") or ("6332147") or ("6522342") or ("6449365") or ("6222530")).PN. | USPAT | 2003/11/26 15:45
20 | 2 | (("6452609") or ("6289380")).PN. | USPAT | 2003/11/26 15:54
21 | 1 | ((("6452609") or ("6289380")).PN.) and encod$6 | USPAT | 2003/11/26 16:18
22 | 0 | 6567796.pn. and encod$6 | USPAT | 2003/11/26 16:52
23 | 3831 | administrator$3 with user$3 | USPAT | 2003/11/26 16:48
24 | 1 | 6452609.pn. and (digital$3 file$5) | USPAT | 2003/11/26 16:48
25 | 2 | ("6452609" "6567796").pn. and (convert$6 convers$6 encod$6) | USPAT | 2003/11/26 16:56
26 | 112 | (media adj stream$5) with (encod$5 digital) | USPAT | 2003/11/26 16:55
27 | 0 | ("6452609" "6567796").pn. and (compress$5 packetiz$6) | USPAT | 2003/11/26 16:58
28 | 0 | ("6452609" "6567796").pn. and (packet$5 header$5) | USPAT | 2003/11/26 17:30
29 | 2 | hypertext with (digital adj file$3) | USPAT | 2003/11/26 17:02
30 | 57 | hypertext near8 (digital) | USPAT | 2003/11/26 17:02
31 | 1 | ("6452609" "6567796").pn. and ((window media real) adj player$3) | USPAT | 2003/11/26 17:43
32 | 2 | ("6452609" "6567796").pn. and (web page$5) | USPAT | 2003/11/26 17:21
34 | 3 | ((media voice audio) adj encod$5) with (web adj page$5) | USPAT | 2003/11/26 18:35
35 | 5 | ((media voice audio) adj encod$5) with (web) | USPAT | 2003/11/26 17:27
36 | 2 | ("6452609" "6567796").pn. and (hyperlink$5 link$3 url) | USPAT | 2003/11/26 18:46
37 | 304 | manual near3 encod$5 | USPAT | 2003/11/26 18:11
38 | 5 | (manual near3 encod$5 near9 (voice audio media multimedia)) | USPAT | 2003/11/26 18:05
39 | 1 | (manual near3 encod$5) and 6053415.pn. | USPAT | 2003/11/26 17:39
40 | 2 | ("6452609" "6567796").pn. and (window media real player$3) | USPAT | 2003/11/26 17:44
41 | 2 | ("6452609" "6567796").pn. and (multimedia audio video) | USPAT | 2003/11/26 17:45
42 | 2 | ("6452609" "6567796").pn. and (administrat$6 manager$5 central$6) | USPAT | 2003/11/26 17:46
43 | 0 | (manual near3 encod$5) with (download$5) | USPAT | 2003/11/26 18:06
45 | 64 | manual adj encod$5 | USPAT | 2003/11/26 18:24
46 | 130 | (web adj server$5) with gui | USPAT | 2003/11/26 18:24
47 | 5 | ((media real quicktime) adj player$3) with (stop start) with download$6 | USPAT | 2003/11/26 18:39
48 | 19 | ((media real quicktime) adj player$3) with (stop start) | USPAT | 2003/11/26 18:39
49 | 1 | ("6452609" "6567796").pn. and (download$6) | USPAT | 2003/11/26 18:46
- | 1 | ("5835717").PN. | USPAT | 2003/11/26 15:43
- | 3 | (("6434680") or ("6489979") or ("6516356")).PN. | USPAT | 2003/11/26 13:38
  | 723 | (graphical adj user$3 adj interface$5) with browser$5 | USPAT | 2003/11/26 13:39
  | 93 | (graphical adj user$3 adj interface$5) with encod$5 | USPAT | 2003/11/26 13:39

Electronic Edition of the Midrash Pirqe Rabbi Eliezer

Creating an Encoding Manual 

Lewis M. Barth 

INTRODUCTION 

This paper deals with several structural and encoding issues encountered in creating a manual for
encoding an electronic edition of Pirqe Rabbi Eliezer (Pirqe R. El.), the Chapters of Rabbi Eliezer. Pirqe R. El. is a
midrashic retelling of significant aspects of the biblical narrative, from the creation story through the Book of Esther.
It was written in the Land of Israel, probably during the eighth century CE, i.e., in the early Muslim period. It contains
references to Aisha, the wife of Mohammed, and to Fatima, his daughter. Pirqe R. El. seems to be a kind of narrative
reader's digest of rabbinic traditions on the biblical text. In fact it contains strong echoes of material known only from
Old Testament Pseudepigrapha, as well as mystical and astrological material. The language of Pirqe R. El. is Hebrew
with a few non-Hebrew loan words in transliteration.



Pirqe R. El. was exceedingly popular in medieval and pre-modern traditionalist Jewish literary circles. It is preserved
in more than twenty complete manuscripts containing fifty-two to fifty-four chapters and more than seventy-five partial
manuscripts and fragments. In addition, over thirty printed editions of Pirqe R. El. have appeared since the sixteenth
century. Recently, scholarly interest in Pirqe R. El. has focused on literary, historical and interpretative issues.[1]

There is no scholarly edition of this text in the modern sense of this term. An electronic text of Pirqe R. El. exists and
is commercially available.[2] However, the present e-text is simply the encoded version of a semi-critical eclectic edition
which appeared in the 1940's. That edition, originally prepared by a scholar named Michael Higger, is based on three
manuscripts whose relationship has not been fully determined. In both Higger's version and its electronic copy, there
is no mark-up. In particular, there are no references indicating the source of the hundreds of citations from
numerous biblical or rabbinic passages which are found in the text.

The initial goal of this project was to create a critical edition of Pirqe R. El. The goal has now expanded to include
electronic publication of all Pirqe R. El. manuscripts and fragments in two forms: digital facsimiles and transcriptions
with hypertext links. There are two reasons for this: 1) the quantity of textual material and 2) recent hypotheses
regarding the development of medieval Hebrew manuscripts which argue that each manuscript of a work is a
completely new literary creation.[3] Thus the need to present visually a representation of each manuscript (at least the
major ones), with the possibility of comparing complete readings of specific passages on the fly - something possible
only electronically.



This paper will elaborate on some matters raised in the Introduction and concentrate on technical areas necessary for
preparing an Encoding Manual for this project - both to remind me of what I decided to do and guide others who may



SGML/TEI ELEMENT and ID ATTRIBUTE DESIGNATIONS FOR 
DIVISIONS of the TEXT 



The first issue emerges from document analysis. It concerns the simple question: as this is a "prose" document, how
should the text of Pirqe R. El. be divided?[4] Such a decision is related to the choice of representing either the units of
meaning (Chapters, Paragraphs) or the physical makeup of a manuscript (Pages and Lines). As the markup will be in
SGML/TEI, units of meaning might be designated by the elements:



1. DIV (division) in which:

   1. The attribute TYPE would contain "chapter."

   2. The attribute ID would contain 1) an abbreviation for the name of the work (PRE), 2) the database number
      of the manuscript, and 3) the specific chapter number.

2. P (paragraph) in which:

   1. The ID attribute would contain 1) an abbreviation for the name of the work (PRE), 2) the database number
      of the manuscript, 3) the specific chapter number, and 4) the specific paragraph number.



Alternatively, the physical makeup of the text could be represented by the elements:



1. DIV in which:

   1. The attribute TYPE would indicate "folio."

   2. The attribute ID would contain 1) an abbreviation for the name of the work (PRE), 2) the database number
      of the manuscript, and 3) the specific folio number with side designated "a", "b", "c", or "d".

2. L (line) in which:

   1. The ID attribute would contain 1) an abbreviation for the name of the work (PRE), 2) the database number
      of the manuscript, 3) the specific folio number with side designated "a", "b", "c", or "d", and 4) the
      specific line number.
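
The two ID schemes just listed can be sketched in a few lines of Python. This is only an illustration, not part of the Encoding Manual: the period separator follows the example given in note 8, while the two-digit padding of the manuscript number and all function names are assumptions made for the sketch.

```python
WORK = "PRE"  # abbreviation for the work, as given in the text
SIDES = ("a", "b", "c", "d")  # folio side designations named in the text


def _ms(manuscript: int) -> str:
    # Assumption: manuscript database numbers are zero-padded to two digits ("04").
    return f"{manuscript:02d}"


def chapter_id(manuscript: int, chapter: int) -> str:
    """ID for a DIV of TYPE "chapter": work, manuscript, chapter."""
    return ".".join([WORK, _ms(manuscript), str(chapter)])


def paragraph_id(manuscript: int, chapter: int, paragraph: int) -> str:
    """ID for a P: the chapter ID plus the paragraph number."""
    return f"{chapter_id(manuscript, chapter)}.{paragraph}"


def folio_id(manuscript: int, folio: int, side: str) -> str:
    """ID for a DIV of TYPE "folio": work, manuscript, folio number with side."""
    if side not in SIDES:
        raise ValueError(f"side must be one of {SIDES}, got {side!r}")
    return ".".join([WORK, _ms(manuscript), f"{folio}{side}"])


def line_id(manuscript: int, folio: int, side: str, line: int) -> str:
    """ID for a physical line: the folio ID plus the line number."""
    return f"{folio_id(manuscript, folio, side)}.{line}"
```

Under these assumptions, paragraph 3 of chapter 12 in manuscript 4 would receive the ID PRE.04.12.3, and line 1 of folio 26b in the same manuscript would receive PRE.04.26b.1.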



Two problems emerge regarding text division, both having to do with SGML/TEI limitations and neither unique to this
project. First: the problem of overlapping hierarchies. It is not presently possible to do concurrent markup, that is, to
tag the material simultaneously by units of meaning and by physical layout. Second, the TEI L tag is reserved for a line
in poetry, not a physical line of a prose manuscript, i.e., it encloses a unit of meaning which is contained in a physical
line even when the meaning may run on to the next line.

The way around this is through the use of various MILESTONE elements: MILESTONE, PB (page break), and LB 
(line break) which contain attributes to indicate divisions in the text, but cannot contain text. 

In regard to encoding manuscripts of Pirqe R. El., or any rabbinic text, after long evaluation, I have concluded that the
basic initial encoding must be in units of meaning. Rabbinic units of meaning contain quotes - primarily from the
Hebrew Bible - which may flow through two or occasionally three lines in a manuscript. In the present state of
software development, it is not possible to place an opening QUOTE tag within a line enclosed by an L tag, and then
place its closing QUOTE in another line. Thus one is forced to use units of meaning to divide the text and then insert
MILESTONE tags to indicate page and line breaks for each separate manuscript.
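
A small Python sketch may make the consequence of this decision concrete: when line breaks are empty LB milestones inside a meaning-based paragraph, a QUOTE can cross a physical line boundary and the physical lines can still be recovered afterwards. The sample markup, the placeholder words WORD1 and WORD2, and the function name are invented for illustration; real SGML would be processed with an SGML parser rather than regular expressions.

```python
import re

# An LB milestone is an empty element carrying only an ID attribute.
LB = re.compile(r'<LB ID="([^"]+)">')
# Any other tag (P, QUOTE, ...) is stripped when showing the physical line.
TAG = re.compile(r"<[^>]+>")

# Invented sample: one paragraph whose QUOTE runs across a line break.
SAMPLE = (
    '<P ID="PRE.04.1.1"><LB ID="PRE.04.26b.1">WORD1 <QUOTE>BRAWYT BRA '
    '<LB ID="PRE.04.26b.2">ALHYM</QUOTE> WORD2</P>'
)


def physical_lines(marked_up: str) -> list[tuple[str, str]]:
    """Return (line ID, plain text) pairs, one per LB milestone."""
    # re.split with one capturing group yields:
    # [text before first LB, id1, text1, id2, text2, ...]
    parts = LB.split(marked_up)
    lines = []
    for line_id, chunk in zip(parts[1::2], parts[2::2]):
        plain = " ".join(TAG.sub(" ", chunk).split())
        lines.append((line_id, plain))
    return lines
```

Here the QUOTE element stays properly nested inside the paragraph, while physical_lines(SAMPLE) gives back the two manuscript lines with their IDs.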



CANONICAL REFERENCE SCHEME 



One further comment regarding encoding Pirqe R. El. using units of meaning. The manuscripts, printed editions and
electronic version of Pirqe R. El. divide the text by chapters, but do not contain paragraph divisions. Modern
translations - in English, French and Spanish - do separate the text into paragraph units, but do not provide a reference
scheme, beyond the page numbers of the particular translation. In sum, there is no agreed upon "canonical" reference
scheme.[5]

The only fully developed canonical reference system is found in the electronic text created by the "Academy of the
Hebrew Language" for its "Historical Dictionary of the Hebrew Language". This electronic edition is based on one
primary manuscript selected for its linguistic properties - New York, JTS Enelow 886 (Yemen, 1654) - corrected
against four others. Consequently, it either contains material - and therefore "paragraphs" - found only in
manuscripts of the same family, or lacks material - and therefore "paragraphs" - found in
manuscripts of a different family. Nevertheless, the AHL numbering will generally be used to establish a canonical
reference system, though it may be revised as encoding of the different manuscripts proceeds.

ABBREVIATIONS 

As far as I can determine, this is the first SGML/TEI editing project of a medieval Hebrew work. Consequently, the
issue of abbreviations and references of all kinds in the electronic context needs to be addressed. Printed editions, and
especially translations, of Pirqe R. El. contain notes and index references to the Bible (Hebrew Bible, LXX or NT),
Apocrypha, Pseudepigrapha, the Dead Sea Scrolls, Rabbinic Literature, and the Church Fathers. Numerous modern
scholarly publications (books, journals, etc.) contain references to Pirqe R. El. as well.

Several questions and issues have emerged in regard to abbreviations and references. 

First, the text is in Hebrew. Consequently, when a source is originally in Hebrew, should references contain Roman or
Hebrew characters for titles of books?

Second, because of the differences in character representation between print media and electronic media, standard
listings of abbreviations and references cannot always be used, or need to be modified. For example, references to
rabbinic tractates in some abbreviation systems use scholarly transliteration. This includes diacritical marks, among
which are superscript half circles to represent the Hebrew letters ALEF and AYIN at the beginning of words. Even if
opening and closing parentheses are substituted for these signs, searching mechanisms don't particularly like them, or
require that they be differentiated from code.

Third, the study of biblical literature is, of course, international. Western systems which provide references to biblical
books are often reflective of different cultural traditions and can even differ within the same language. For example, in
English language countries verses from the biblical prophet Isaiah are often referenced in the following ways: Isa and Is
(with or without a period). In German, this prophet's name is Jesaja, and referenced Jes; in French, the same prophet is
Esaïe, and referenced Es. The tendency in recent scholarly abbreviation of scriptural and related titles is not to include a
period after the book reference. Thus, Isa 1:5. The space after the name and the colon between the chapter and verse
work well in printing, but not in an electronic context.

Finally, even in so-called standard works, such as the Bible, differences exist in verse numbering (i.e., various editions
of the HB, NT or LXX).[6] Thus, it becomes necessary to indicate the specific edition of the work in a bibliographical
note.

How does one proceed without reinventing the wheel? First, by choosing existing standards and indicating where 
modification is necessary. Second, by exploiting the advantages of electronic search mechanisms, one of which is to 
use the period as a delimiter, setting off parts of a reference. 

The language for the scholarly notations and tagging of this project will be English, and the reference standard that of
the American Academy of Religion/Society of Biblical Literature, as found in SBL: Membership Directory and
Handbook, 1994, pp. 224-240.[7] However, superscript for "ALEF" and "AYIN" as well as other diacritic marks for
rabbinic texts are omitted. Where the reference contains two words, no space should be placed between the words;
ex. "Ros Has" = <Rosh Hashanah> would appear "RosHas." If at all possible, each source reference should be composed
of four parts, each part separated by a period; ex. "HB.Gen.20.15."[8] The first part represents the general body of
literature (HB=Hebrew Bible), the second the specific text (Gen=Genesis), the third either the chapter (20=chapter 20)
or the folio, the fourth either the verse (15=verse 15) or the column.



Reference examples: 



• HB.Gen.20.15=Hebrew Bible, Genesis 20:15.

• LXX.Gen.20.15=Septuagint, Genesis 20:15.

• NT.Matt.5.6=(Greek) New Testament, Matthew 5:6.

• Rab.mAbot.1.3=Rabbinic Literature, Mishna, Abot 1:3.

• Rab.bBer.25.a=Rabbinic Literature, Babylonian Talmud, Berakot 25a.

• QL.1QapGen.1.3=Qumran Literature, Genesis Apocryphon from Qumran Cave 1.

Note that there should be a period even prior to the page or folio in a Talmud reference. 
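
The four-part structure lends itself to a very simple builder and parser. The following Python sketch is one possible reading of the rules above (no internal spaces in abbreviations, four non-empty parts, period as delimiter); the names Reference, join_abbreviation, make_reference and parse_reference are hypothetical and chosen only for this illustration.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    corpus: str    # general body of literature, e.g. "HB"
    text: str      # specific text, e.g. "Gen"
    division: str  # chapter or folio, e.g. "20" or "25"
    unit: str      # verse or column, e.g. "15" or "a"


def join_abbreviation(abbrev: str) -> str:
    """Drop spaces inside a two-word abbreviation: "Ros Has" -> "RosHas"."""
    return "".join(abbrev.split())


def make_reference(corpus: str, text: str, division, unit) -> str:
    """Assemble the four parts into a period-delimited reference."""
    return ".".join([corpus, join_abbreviation(text), str(division), str(unit)])


def parse_reference(ref: str) -> Reference:
    """Split a reference into its four parts, rejecting malformed input."""
    parts = ref.split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected four period-separated parts: {ref!r}")
    return Reference(*parts)
```

With this sketch, parse_reference("Rab.bBer.25.a") separates the folio (25) from the side (a), which is why the period before the page or folio side matters.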

Such references are to be encoded as text (CDATA) as in the following example:[9]
<CIT><BIBL><ADD>HB.Gen.1.1</ADD></BIBL><Q>BRAWYT BRA ALHYM</Q></CIT>
This form of encoding canonical references in the text itself allows the following to happen:



Searches can be created for 



• The body of literature (HB) 

• The body of literature (HB), and the specific biblical book (Gen). 

• The body of literature (HB), the specific biblical book (Gen) and the chapter (1). 

• The body of literature (HB), the specific biblical book (Gen), the chapter (1) and the verse.
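
The four levels of search can be sketched as a single prefix match on the encoded reference. The Python below is a rough illustration only: it pulls references out of the CIT markup with a regular expression (a real system would use an SGML-aware search tool), and the sample text with its second and third citations and "..." quotations is invented.

```python
import re

# Matches the reference text inside <BIBL><ADD ...>...</ADD></BIBL>;
# [^>]* allows for attributes such as RESP on ADD.
CIT_REF = re.compile(r"<BIBL><ADD[^>]*>([^<]+)</ADD></BIBL>")

# Invented sample with three citations.
SAMPLE = (
    "<CIT><BIBL><ADD>HB.Gen.1.1</ADD></BIBL><Q>BRAWYT BRA ALHYM</Q></CIT> "
    "<CIT><BIBL><ADD>HB.Gen.10.2</ADD></BIBL><Q>...</Q></CIT> "
    "<CIT><BIBL><ADD>HB.Exod.1.1</ADD></BIBL><Q>...</Q></CIT>"
)


def find_references(sgml: str, prefix: str) -> list[str]:
    """Return the references that equal prefix or continue it after a period."""
    refs = CIT_REF.findall(sgml)
    return [r for r in refs if r == prefix or r.startswith(prefix + ".")]
```

Because the period acts as a delimiter, a search for "HB.Gen.1" finds HB.Gen.1.1 but not HB.Gen.10.2, which a plain substring search would also return.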



In addition: 



• The entire canonical reference enclosed in <BIBL><ADD>HB.Gen.1.1</ADD></BIBL> can be hidden or
visible.

• The responsibility for the identification of the reference can be designated in the RESP attribute of the element
ADD.

• It would be easy to replace the English reference scheme for biblical or rabbinic books and, with proper software,
reverse the visual presentation of the reference, with numbers retained in proper order (HB.Gen.1.1 ->
1.1.ARB.KNT).
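
The last point can be illustrated with a short Python sketch. It rests on two assumptions that go beyond the text: the replacement labels TNK and BRA are simply read off the single example above, and "numbers retained in proper order" is taken to mean that the digits inside each number are not reversed while the parts themselves are shown in reverse sequence.

```python
# Hypothetical lookup table, inferred from the example HB.Gen.1.1 -> 1.1.ARB.KNT.
HEBREW_LABELS = {"HB": "TNK", "Gen": "BRA"}


def reversed_display(ref: str) -> str:
    """Swap English labels for transliterated ones and reverse the visual order."""
    parts = [HEBREW_LABELS.get(part, part) for part in ref.split(".")]
    shown = []
    for part in reversed(parts):
        # Numbers keep their digit order; labels are reversed letter by letter.
        shown.append(part if part.isdigit() else part[::-1])
    return ".".join(shown)
```

On this reading, reversed_display("HB.Gen.1.1") reproduces the example 1.1.ARB.KNT, and HB.Gen.20.15 would be shown as 15.20.ARB.KNT.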



CONCLUSIONS 



SGML/TEI markup is particularly useful for scripturally based texts, i.e., texts from the vast literatures of Judaism,
Christianity and Islam which frequently cite biblical or koranic verses. There are numerous genres in these religious
literatures (exegetical works, homilies, scriptural essays, dialogues, legal texts, liturgical texts, religious poetry, etc.).
They all have in common the citation of texts sacred to a religious community, the frequent mention of characters,
places and institutions found in such texts, plus references to later individuals, places and institutions. In addition, the
texts are often macaronic, i.e., they contain more than one human language.

Such texts offer particular problems for electronic presentation, apart from the issues of the non-existence of SGML
software for viewing correctly encoded Near-Eastern languages. This paper has focused on technical issues, the
solution of which will be indicated in an Encoding Manual used both as a supplement to viewing the electronic text
and as a guide for those participating in the encoding process.

(An earlier version of this paper was presented at the ALLC-ACH 1996 Joint International Conference, Bergen,
Norway, June 1996.)

1. See the numerous articles cited in notes by Jacob Elbaum, "Rhetoric, Motif and Subject-Matter: Toward an Analysis
of Narrative Technique in Pirke de-Rabbi Eliezer," Jerusalem Studies in Jewish Folklore, XIII-XIV (1992), 99-126.
Pirqe R. El. was translated into Latin by the sixteenth century: R. Eliezer f. Hircani: Liber sententiarum Judaicarum,
trans. Konrad Pellikan (1546) [see comment on this by Hans Jakob Haag, Pirqe DeRabbi Eli'ezer Kap. 43,
Magisterarbeit, Köln, 1978]. In the twentieth century Pirqe R. Eliezer has been translated into English, French and
Spanish: Pirke De Rabbi Eliezer, trans. Gerald Friedlander (London, 1916; reprint, New York: Hermon Press, 1965);
Pirqe De Rabbi 'Eliezer: Leçons De Rabbi Eliezer, trans. Marc-Alain Ouaknin, Eric Smilevitch and Pierre-Henri
Salfati (Paris: Verdier, 1984); and Los Capitulos De Rabbi Eliezer, trans. M. Perez Fernandez (Valencia, 1984).

2. Bar Ilan Database (Responsa Database, Bar Ilan University); STM Database (Polytext, Jerusalem) and Davka
database of Rabbinic Literature.

3. For the debate on this issue between Schafer and Milikowsky, see: Milikowsky, Chaim, "The Status Quaestionis of
Research in Rabbinic Literature." JJS 39, no. 2 (1988): 201-211 and Schafer, Peter. "Once Again the Status Quaestionis
of Research in Rabbinic Literature: An Answer to Chaim Milikowsky." JJS 40, no. 1 (1989): 89-94. In addition,
Malachi Beit-Arie has approached the same question from the perspective of codicological issues. See: Malachi
Beit-Arie, "Transmission de textes par scribes et copistes. Interferences inconscientes et critiques", Les problemes poses
par l'edition critique des textes anciens et medievaux (Louvain-la-Neuve, 1992), 175.

4. Peter Robinson has written, "perhaps the most important decision an encoder of scholarly text must face is how the
text should be divided" (Transcription, p. 64).

5. Traditional citing most often utilizes the pagination of the edition of the RaDaL, the page division of the edition of 
Higger, or occasionally reference to the "critical edition" of C. M. Horowitz. The problems of all these texts will be 
discussed in a separate document "Introduction: the Need for a Critical Edition of Pirqe R. El." 

6. My thanks to Robin Cover for reminding me of this.

7. For abbreviations of journals, etc., additional items are found in the Index of Articles on Jewish Studies (The Jewish
National and University Library: Jerusalem, 1995 and earlier), "List of Periodicals and the Collections and their
Abbreviations," and International Glossary of Abbreviations for Theology and Related Subjects, ed. Siegfried
Schwertner (Walter de Gruyter: Berlin and New York, 1974).

8. The MILESTONE tag LB (line break) will also use a four part structure for the ID attribute. Example: PRE.04.26b.1.
This refers to the work Pirqe R. El.; manuscript 04 (so designated in the manuscript database); folio 26b (a = recto or
b = verso); line 1.

9. My thanks to Lou Burnard for suggesting this tagging and encoding for canonical references. In the published
abstract of this paper I had stated: "Such references are to be used in the attribute "N" for QUOTE and in various
notation and bibliographical elements." It soon became clear to me that it is not possible to do the kind of searching
indicated above if the canonical reference was placed within an element attribute. Once I had utilized this CIT scheme,
the additional advantages for visibility, identification of responsibility and ease of revision became clear.